Appropriate use of Color

PH345: Winter 2025

Phil Boonstra

Case Study: FARS 2022

Fatal Accident Reporting System (FARS)

  • Data from National Highway Traffic Safety Administration (NHTSA) on fatal traffic accidents in the US

  • Collected and reported annually from 1975. Most recent year available is 2022

  • Data are available at https://www.nhtsa.gov/file-downloads?p=nhtsa/downloads/FARS/2022/National/

  • For each state, I calculated the fraction of accidents that occurred in daylight and the fraction that occurred in clear weather, then compared these fractions across states.
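The per-state calculation described above can be sketched in base R. This is a minimal illustration on a toy data frame: the column names (`state`, `light`, `weather`) and their values are hypothetical stand-ins, not the actual FARS schema, and the numbers are made up.

```r
# Toy stand-in for one row per fatal accident; names and values are
# hypothetical and do not reproduce the FARS file layout.
accidents <- data.frame(
  state   = c("MI", "MI", "MI", "MI", "OH", "OH", "OH", "OH", "OH"),
  light   = c("Daylight", "Dark", "Daylight", "Daylight",
              "Dark", "Dark", "Daylight", "Dusk", "Dark"),
  weather = c("Clear", "Rain", "Clear", "Snow",
              "Clear", "Clear", "Clear", "Rain", "Clear")
)

# The mean of a logical vector is the fraction of TRUE values,
# so averaging the indicators within each state gives the two fractions.
state_summary <- aggregate(
  data.frame(
    frac_daylight = accidents$light == "Daylight",
    frac_clear    = accidents$weather == "Clear"
  ),
  by  = list(state = accidents$state),
  FUN = mean
)
state_summary
```

Each row of `state_summary` is then one point in a scatterplot of daylight fraction against clear-weather fraction.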

Three big problems with this plot:

  1. Too many colors to be helpful
  2. Color scheme encourages comparisons between alphabetically adjacent states
  3. Lots of plot space taken up by the legend

What the previous plot looks like to someone with 80% deuteranopia (can’t see green well)

https://bioapps.byu.edu/colorblind_image_tester
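A similar simulation can be approximated directly in R. The sketch below assumes the colorspace package is installed and uses its `deutan()` function; treating `severity = 0.8` as equivalent to the "80%" setting of the web tool is an assumption, since the two tools may use different models. The six input colors are evenly spaced hues intended to resemble ggplot2's default discrete colors.

```r
library(colorspace)

# Six evenly spaced hues at fixed chroma and lightness
# (intended to approximate ggplot2's default discrete colors)
original <- hcl(h = seq(15, 315, by = 60), c = 100, l = 65)

# Simulate deuteranopia; severity runs from 0 (none) to 1 (complete).
# Mapping "80% deuteranopia" to severity = 0.8 is an assumption.
simulated <- deutan(original, severity = 0.8)

data.frame(original, simulated)
```

Colors that were easy to tell apart in `original` but end up close together in `simulated` are the ones a viewer with deuteranopia is likely to confuse.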

Grouping by region instead of individual states

Labeling outlying states explicitly using ggrepel R package (Slowikowski, 2024)

  1. Clearly something is wrong with the data for Virginia regarding daylight accidents. Kansas and Vermont are also worth a closer look (dropped from plot below)
  2. In US Northeast and South, accidents occurred more often in non-daylight hours and clear weather; in US Midwest, in daylight hours; in US West, there is variability across states
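Coloring by region and labeling only outlying states could be sketched as follows. This is an illustration, not the code behind the slides: the fractions are random toy values, the 1.5-standard-deviation outlier rule is a hypothetical choice, and the regions come from R's built-in `state.region`, which calls the Midwest "North Central".

```r
library(ggplot2)
library(ggrepel)

# Toy per-state fractions (random values, not FARS results)
set.seed(345)
state_summary <- data.frame(
  state         = state.name,
  region        = state.region,  # built-in four-level region factor
  frac_daylight = runif(50, 0.35, 0.65),
  frac_clear    = runif(50, 0.55, 0.90)
)

# Hypothetical outlier rule: more than 1.5 SD from the mean on either axis
z <- function(v) abs(v - mean(v)) / sd(v)
is_outlier <- z(state_summary$frac_daylight) > 1.5 |
  z(state_summary$frac_clear) > 1.5

# Empty labels are not drawn, so only outlying states get text
state_summary$label <- ifelse(is_outlier, state_summary$state, "")

region_plot <-
  ggplot(state_summary, aes(x = frac_daylight, y = frac_clear)) +
  geom_point(aes(color = region), size = 2) +
  geom_text_repel(aes(label = label), max.overlaps = Inf) +
  scale_color_brewer(palette = "Dark2") +
  labs(x = "Fraction in daylight", y = "Fraction in clear weather",
       color = "Region")
region_plot
```

Four region colors replace one color per state, and direct labels on the few outliers remove the need to look them up in a legend.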

Defining Color

Color can be defined by its hue (the defining attribute, e.g. blue or red), lightness (the brightness), and chroma (the richness of a color).

  • Three hues (red, green, blue)
  • Three lightnesses (top [not bright], middle, bottom [bright])
  • Ten chromas (left [not intense] to right [intense])
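A grid like the one described above can be generated with base R's `hcl()` function, which builds a color from hue, chroma, and lightness. This is a sketch: the hue angles chosen for red, green, and blue and the specific lightness and chroma values are approximate assumptions, not the values used for the slide's figure.

```r
# Three hues x three lightnesses x ten chromas = 90 colors.
# Hue angles are rough choices for red, green, and blue in HCL space.
grid <- expand.grid(
  chroma    = seq(10, 100, length.out = 10),  # low = grayish, high = intense
  lightness = c(30, 60, 90),                  # low = dark, high = bright
  hue       = c(0, 120, 240)                  # approx. red, green, blue
)

# hcl() returns hex color strings; combinations that cannot be displayed
# are adjusted to the nearest displayable color by default.
grid$color <- hcl(h = grid$hue, c = grid$chroma, l = grid$lightness)

head(grid)
```

Holding two of the three attributes fixed and varying the third is a quick way to see what each one contributes.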

Cynthia Brewer

American cartographer and professor of geography at Penn State University

Pioneering work in developing color schemes for maps (https://colorbrewer2.org)

Recipient of the Carl Mannerfelt Gold Medal in 2023

Color scales

Instead of choosing individual colors, typically use a predefined ‘palette’ of colors. Three types of palettes:

  • Sequential: colors follow a gradient from low to high
  • Qualitative: hue-based palettes for categorical data
  • Diverging: two sequential palettes “pasted together”

Many palettes are available in R, including through ggplot2

https://colorbrewer2.org/
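The three palette types can be inspected with the RColorBrewer package, which provides the ColorBrewer palettes in R. The sketch below assumes RColorBrewer is installed; the choice of example palettes is arbitrary.

```r
library(RColorBrewer)

# One example of each palette type, five colors each
brewer.pal(5, "Blues")  # sequential: light to dark blue
brewer.pal(5, "Dark2")  # qualitative: distinct hues
brewer.pal(5, "BrBG")   # diverging: brown to blue-green through a light midpoint

# brewer.pal.info records each palette's type ("seq", "qual", or "div"),
# its maximum number of colors, and whether it is flagged colorblind-safe
palette_info <- brewer.pal.info[c("Blues", "Set3", "Dark2", "Spectral", "BrBG"), ]
palette_info
```

The `maxcolors` column matters when a plot has many categories: a palette cannot supply more distinct colors than that.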

Default color scheme in ggplot2

library(ggplot2)
library(datasauRus)
dino_plot <-
  ggplot(datasaurus_dozen) +
  geom_point(aes(x = x, y = y, color = dataset), size = 1) +
  facet_wrap(vars(dataset), ncol = 5) +
  labs(x = NULL, y = NULL) +
  # hide the color legend; the facet titles already name each dataset
  guides(color = "none") +
  theme(text = element_text(size = 18))
dino_plot

Set3 palette (qualitative)

library(datasauRus)
dino_plot +
  scale_color_brewer(palette = "Set3")

Dark2 palette (qualitative)

library(datasauRus)
dino_plot +
  scale_color_brewer(palette = "Dark2")

Spectral palette (diverging)

library(datasauRus)
dino_plot +
  scale_color_brewer(palette = "Spectral")

BrBG palette (diverging)

library(datasauRus)
dino_plot +
  scale_color_brewer(palette = "BrBG")

Misleading comparisons

Perception of color can vary. (a,b) The same color can look different (a), and different colors can appear to be nearly the same by changing the background color (b). (c) The rectangles in the heat map indicated by the asterisks (*) are the same color but appear to be different.

Figure 1 from Wong (2010)

Common pitfalls / Recommendations

  • Ignoring color blindness
    • Use color-blind friendly color palettes when possible
  • Too much information
    • Use containment or other aesthetics to assist interpretation
    • Avoid using more than 6-8 colors in a plot (Wong, 2011)
  • Misleading comparisons
    • Viewers have difficulty mapping color changes to quantitative variables
  • Color scales
    • Consider how colors relate to each other and to the background
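As one way to act on the first recommendation, the sketch below applies the Okabe-Ito palette, a color-blind friendly qualitative palette that ships with base R (version 4.0.0 or later) via `palette.colors()`. The toy data frame is hypothetical and exists only to show the scale; eight groups is also about the upper limit suggested above.

```r
library(ggplot2)

# Eight Okabe-Ito colors; unname() so ggplot2 assigns them by position
# rather than trying to match color names to factor levels
okabe_ito <- unname(palette.colors(n = 8, palette = "Okabe-Ito"))

# Hypothetical toy data with eight groups
toy <- data.frame(x = 1:8, y = 1:8, group = factor(LETTERS[1:8]))

cb_plot <-
  ggplot(toy, aes(x = x, y = y, color = group)) +
  geom_point(size = 4) +
  scale_color_manual(values = okabe_ito)
cb_plot
```

Any existing ggplot with a discrete color aesthetic of eight or fewer levels could take the same `scale_color_manual()` call.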

References

Slowikowski K, 2024. ggrepel: Automatically Position Non-Overlapping Text Labels with ‘ggplot2’. R package version 0.9.5, https://CRAN.R-project.org/package=ggrepel.

Wilke, C.O., 2019. Fundamentals of data visualization: a primer on making informative and compelling figures. O’Reilly Media.

Wong, B., 2010. Color coding. Nature Methods, 7(8), p.573.

Wong, B., 2011. Color blindness. Nature Methods, 8(6), p.441.